INFLATION OF FOOD AND ITS EFFECT ON DIFFERENT EXPENDITURE GROUPS IN TURKEY

Author

Yiğit Muzaffer YILDIZ

1. Project Overview and Scope

Every country experiences inflation, which can be positive or negative. Positive inflation is commonly read as a signal of a country’s economic status because it affects people’s purchasing power and their ability to maintain their standard of living. For the last few years, Turkey has been one of the countries suffering from high inflation. Inflation is measured through the consumer price index (CPI), which is broken down into sub-categories such as food, clothing, education, and transportation. This project focuses on the food CPI and its effect on groups of people with different total expenditure levels. To analyze this effect, two datasets, “Consumer price index (2003=100) according to main and sub-main groups, 2005-2025” and “Distribution of household consumption expenditure by quintiles ordered by expenditure, Türkiye, 2002-2023”, are taken from the website of TUIK, the Turkish Statistical Institute.

2. Data

#install.packages("readxl")
library(readxl)
library(dplyr)
library(ggplot2)
library(tidyverse)
library(simplermarkdown)
consumer_price_index_data <- read_xlsx(path = "consumer_price_index_according_to_groups.xlsx", .name_repair = "unique_quiet")
comparison_of_consumption_types <- read_xlsx(path = "comparison_of_consumption_types_according_to_expenditure.xlsx")

2.1 General Information About Data

The data files used for the in-depth analysis are taken from the TUIK website. The dataset named “Distribution of household consumption expenditure by quintiles ordered by expenditure, Türkiye, 2002-2023” gives the distribution of consumption expenditure types for different expenditure groups. It is also referred to as “comparison_of_consumption_types” and “COCT data” throughout the project. There are 5 groups in the columns, named “First quintile”, “Second quintile”, “Third quintile”, “Fourth quintile”, and “Last quintile”. Each represents 20% of the people covered by the survey, in ascending order of expenditure amount. Namely, the first quintile represents the 20% of people who spend the least, while the last quintile represents the 20% who spend the most. The rows hold the different expenditure types for each year.

COCT Data on TUIK Website

The other dataset is named “Consumer price index (2003=100) according to main and sub-main groups, 2005-2025” and is also referred to as “consumer_price_index_data” and “CPI data” throughout the project. It gives the consumer price index values of 288 main groups and sub-groups of expenditure for each month from 2005 to 2025. Year and month information is in the rows, while the group names form the columns. Only the columns directly related to food are considered in the scope of this project.

CPI Data on TUIK Website

The project covers the years 2005-2023 for both datasets.

2.2 Reason for Choice

The inflation of a country tells a lot about its economic status. It tells so much that the food aspect might be overlooked. Nevertheless, access to food has always been one of the major concerns of mankind. Over the decades, this concern has become more pressing for Turkish people, especially those with a low level of income who belong to the first and second quintiles of expenditure. The data chosen for this project helps readers see which group of people spends what percentage of its money on food and how this percentage changes with inflation and the food CPI. The conclusion of the project might shed light on socioeconomic differences among Turkish people.

2.3 Preprocessing

To begin with, the raw data, downloaded from the TUIK website in Excel (.xlsx) format, are inspected as they are. Some rows and columns are deleted because they only contain explanatory text about the data. Then the Turkish headings are removed from the rows and columns of the files. After these few operations in Excel, the files are ready to be imported into R. Both datasets are also saved in RData format for processing in R.

save(consumer_price_index_data, file = "consumer_price_index_data.RData")
save(comparison_of_consumption_types, file = "comparison_of_consumption_types.RData")
load("consumer_price_index_data.RData")
load("comparison_of_consumption_types.RData")

Downloadable COCT Data in .RData version

Downloadable CPI Data in .RData version

nrow_CPI <- nrow(consumer_price_index_data)
ncol_CPI <- ncol(consumer_price_index_data)
nrow_COCT <- nrow(comparison_of_consumption_types)
ncol_COCT <- ncol(comparison_of_consumption_types)
summary_of_number_of_row_and_columns <- data.frame(
data_set_name = c("consumer_price_index_data","comparison_of_consumption_types"),
number_of_rows = c(nrow_CPI, nrow_COCT),
number_of_columns = c(ncol_CPI, ncol_COCT)
)

knitr::kable(summary_of_number_of_row_and_columns, caption = "Table 1: Dimensions of the Unprocessed Data")
Table 1: Dimensions of the Unprocessed Data
data_set_name number_of_rows number_of_columns
consumer_price_index_data 243 291
comparison_of_consumption_types 275 8

2.3.1 Comparison of Consumption Types (COCT) Data

All the spaces (“ ”) in the column headings are replaced by the “_” sign. The year values in the first column are trimmed so that each has only 4 digits. Only the rows for food and non-alcoholic beverages expenditure are kept. There is no data for 2020 and 2021, even though the dataset is titled 2002-2023. Besides, 2022 appears twice because two types of expenditure categorization are reported for that year; the two 2022 rows are reduced to one by taking their average. Finally, the Year column is converted from character to numeric, and only the years 2005-2023, excluding 2020 and 2021, are kept.
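The averaging of the two 2022 rows can also be sketched without hard-coded row indexes, by grouping on the year. This is only an illustrative alternative on made-up numbers: `toy_coct` and `toy_merged` are hypothetical names, and the values are not taken from the TUIK file.

```r
library(dplyr)

# Hypothetical toy table standing in for the filtered COCT data;
# 2022 appears twice, as in the real file (the values here are invented)
toy_coct <- data.frame(
  Year = c("2019", "2022", "2022", "2023"),
  First_quintile = c(33.0, 39.0, 40.0, 39.2),
  Last_quintile  = c(14.0, 14.5, 13.5, 12.5)
)

toy_merged <- toy_coct %>%
  mutate(Year = as.numeric(Year)) %>%   # character year -> numeric
  group_by(Year) %>%                    # duplicated years fall into one group
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")           # one averaged row per year

print(toy_merged)
```

Years that appear once pass through unchanged, because the mean of a single value is that value, so the result keeps one row per year in ascending order.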

The final version of the data after the operations is as shown in Table 2:

comparison_of_consumption_types <- comparison_of_consumption_types %>% 
  rename_with(~ gsub(" ","_", .x), contains(" ")) %>% #removing the space " " from the columns and replacing them with "_"
  mutate(Year = substr(Year, 1, 4)) %>% #rearranging the Year column
  filter(Total!=100) %>% #removing the rows having total consumption expenditure information 
  filter(Expenditure_Types=="Food and non-alcoholic beverages")

#Merging the two 2022 rows into one
rows_to_merge <- c(19, 20) #row indexes of the two 2022 rows after filtering
merged_rows <- comparison_of_consumption_types[rows_to_merge, ] #selecting the 2022 rows
new_row <- comparison_of_consumption_types[1, ]  #template row that is overwritten with the merged 2022 values

for (col in names(comparison_of_consumption_types)) { #numeric columns are averaged; text columns keep their value only if both rows agree
  if (is.numeric(comparison_of_consumption_types[[col]])) {
    new_row[[col]] <- mean(merged_rows[[col]], na.rm = TRUE)
  } else if (is.character(comparison_of_consumption_types[[col]]) || is.factor(comparison_of_consumption_types[[col]])) {
    values <- unique(as.character(merged_rows[[col]]))
    if (length(values) == 1) {
      new_row[[col]] <- values
    } else {
      new_row[[col]] <- NA
    }
  } else {
    new_row[[col]] <- NA
  }
}

comparison_of_consumption_types_updated <- comparison_of_consumption_types[-rows_to_merge, ] #deleting the initial two 2022 rows

new_position <- nrow(comparison_of_consumption_types_updated)  # position of the generated row

#rearranging the position of the generated row
COCT_final <- bind_rows(
  comparison_of_consumption_types_updated[1:(new_position - 1), ],
  new_row,
  comparison_of_consumption_types_updated[new_position:nrow(comparison_of_consumption_types_updated), ]
)

COCT_final$Year <- as.numeric(COCT_final$Year) #changing the class of Year column to numeric

COCT_final <- subset(COCT_final[, -2:-3], Year>=2005) #selecting the years starting from 2005

knitr::kable(COCT_final, caption = "Table 2: Change of Food Expenditure Percentage According to Quintiles by Years")
Table 2: Change of Food Expenditure Percentage According to Quintiles by Years
Year First_quintile Second_quintile Third_quintile Fourth_quintile Last_quintile
2005 40.63172 34.88343 30.04158 27.07178 16.74860
2006 39.47612 33.15669 29.56472 26.00507 17.51656
2007 37.30757 31.51043 27.88232 24.10312 16.89984
2008 36.44696 30.21248 26.02253 23.32254 16.28410
2009 34.25025 29.84173 26.33608 24.63697 16.95754
2010 32.52000 27.85000 26.37000 23.09000 15.83000
2011 31.69000 27.42000 24.82000 22.39000 14.61000
2012 31.34000 26.76000 24.05000 21.06000 13.53000
2013 30.42000 26.90000 24.76000 21.36000 13.67000
2014 30.12000 27.41000 23.86000 21.61000 13.60000
2015 31.69000 27.83000 25.07000 21.89000 13.96000
2016 30.93000 27.49000 24.23000 21.39000 13.08000
2017 30.73000 26.48000 24.88000 22.11000 13.35000
2018 30.89000 27.91000 25.63000 22.74000 13.55000
2019 33.36000 28.55000 25.54000 22.19000 14.26000
2022 39.24000 34.28500 30.34500 26.48000 14.19000
2023 39.18000 31.91000 27.60000 23.77000 12.52000

2.3.2 Consumer Price Index (CPI) Data

At first glance, the data has a large number of columns, one per expenditure group, and many of them are outside the topic of this project. These columns are removed, and only the columns holding the general and food CPI values are kept. The rows carry year and month information. An additional row is created for each year as the average of that year’s monthly values. Thanks to this operation, the food CPI and its change can be evaluated by year as well as by month. The month column and the monthly rows are kept in the original data for further analysis.
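The yearly averaging step can be illustrated on a small table. This is a minimal sketch with invented values: `toy_cpi`, `toy_yearly`, and the `Food` column are hypothetical names chosen for the example.

```r
library(dplyr)

# Hypothetical monthly CPI table with three months per year (invented values)
toy_cpi <- data.frame(
  Year   = rep(c(2005, 2006), each = 3),
  Months = rep(c("January", "February", "March"), times = 2),
  Food   = c(100, 110, 120, 130, 140, 150)
)

toy_yearly <- toy_cpi %>%
  group_by(Year) %>%
  # average every numeric column within the year; Year itself is the
  # grouping column, so across() leaves it alone
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop") %>%
  mutate(Months = "Average") %>%        # label the new rows
  select(Year, Months, everything())

print(toy_yearly)
```

The lambda form `~ mean(.x, na.rm = TRUE)` is used here because recent dplyr versions deprecate passing extra arguments such as `na.rm` through `across()` directly.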

The final version of the data after the operations is as shown in Table 3:

#Generating rows containing average CPI value of each year
consumer_price_index_yearly <- consumer_price_index_data %>%
  group_by(Year) %>%
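  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE))) %>% #lambda form; passing na.rm through across() is deprecated in recent dplyr
  mutate(Months = "Average") %>%
  select(Year, Months, everything()) %>%
  filter(Year <= 2023) #keeping the project interval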
  
#head(consumer_price_index_data)

consumer_price_index_food <- consumer_price_index_yearly %>% select(Year | Months | General | contains("food")) #selecting the columns related to food

consumer_price_index_quality_food <- consumer_price_index_yearly %>% select(Year | Meat | Chicken | `Fish and seafood` | Bread | `Milk, cheese and eggs` | Fruit )

COCT_final$Year <- as.numeric(COCT_final$Year)
CPI_food_final <- consumer_price_index_food[, c(1,4)]
CPI_food_final <- rename(CPI_food_final, "CPI_of_food"="Food and non-alcoholic beverages")

knitr::kable(CPI_food_final, caption = "Table 3: Food CPI by Years")
Table 3: Food CPI by Years
Year CPI_of_food
2005 112.0787
2006 122.9450
2007 138.2108
2008 155.8842
2009 168.3875
2010 186.2000
2011 197.8150
2012 214.4567
2013 233.9725
2014 263.4925
2015 292.8617
2016 309.8108
2017 349.1550
2018 411.8717
2019 492.3342
2020 560.5175
2021 696.5883
2022 1293.1833
2023 2144.2950
#consumer_price_index_data <- bind_rows(consumer_price_index_data, consumer_price_index_yearly)

3. Analysis

The analysis part of the project consists of four subsections. They give basic explanations and demonstrations of the analysis of the datasets. Plots comparing the CPI across years, the behavior of the expenditure groups, and food expenditure percentages relative to food CPI values are used for visualization.

3.1 Exploratory Data Analysis

After the preprocessing phase, plotting the updated data is a good start for the analysis. The datasets are first plotted separately to get a general understanding of what they say. As a general assumption, people with the lowest income level are the ones who spend the least. Graph 1 shows that the less people spend in total, the greater the percentage of their spending that goes to food. Over the period, people in the first quintile spend roughly 30% to 41% of their money on food. This demonstrates that this part of society does not have much to spend on anything beyond the fundamentals, like food and housing. Their major concern is access to food.

Another crucial point is that the food expenditure percentage of the last quintile has never exceeded 18% and has stayed below 15% since 2011. In absolute terms, however, this smaller share might be close to or even more than what the other quintiles spend on food, because the total expenditure of the last quintile is much higher. This shows how spending patterns change with total expenditure.

COCT_long <- COCT_final %>%
  pivot_longer(cols = -Year,
               names_to = "category",
               values_to = "value")

ggplot(COCT_long, aes(x = Year, y = value, color = category, group = category)) +
  geom_line(linewidth = 1) + #linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 2) +
  xlab("Year") +
  ylab("Food Expenditure Percentage of Quintiles") +
  ggtitle("Graph 1: Food Expenditure Percentage by Expenditure Groups Over the Years") +
  theme_minimal()

The food CPI has risen every year since 2005, and the rise gets steeper over time, as shown in Graph 2.
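The steepening of the curve can be made concrete by computing the year-over-year percentage change of the yearly food CPI. The sketch below copies the values from Table 3; `food_cpi` and `yoy_pct` are names introduced only for this illustration.

```r
# Yearly average food CPI, copied from Table 3
food_cpi <- data.frame(
  Year = 2005:2023,
  CPI_of_food = c(112.0787, 122.9450, 138.2108, 155.8842, 168.3875,
                  186.2000, 197.8150, 214.4567, 233.9725, 263.4925,
                  292.8617, 309.8108, 349.1550, 411.8717, 492.3342,
                  560.5175, 696.5883, 1293.1833, 2144.2950)
)

# Percentage change relative to the previous year; the first year has no
# predecessor, so its value is NA
food_cpi$yoy_pct <- c(NA, 100 * diff(food_cpi$CPI_of_food) /
                            head(food_cpi$CPI_of_food, -1))

print(round(food_cpi$yoy_pct, 1))
```

A constant percentage change would already produce an upward-bending curve on a linear axis, so looking at `yoy_pct` separates ordinary compounding from a genuine acceleration in the later years.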

consumer_price_index_food %>% 
  filter(Months=="Average") %>%
  ggplot(aes(Year, `Food and non-alcoholic beverages`)) + geom_line(color="pink", linewidth=2) +
  xlab("Year") +
  ylab("CPI") +
  ggtitle("Graph 2 : Change in Food CPI Over the Years") +
  theme_minimal()

The CPI of quality food shows the same behavior. Here, the phrase “quality food” refers to nutritious products, including high-protein food like meat, that are hard to access for people with a low income level.

df_long <- consumer_price_index_quality_food %>%
  pivot_longer(cols = -Year, names_to = "Product", values_to = "CPI")

ggplot(df_long, aes(x = Year, y = CPI, color = Product)) +
  geom_line(linewidth = 1.2) + #linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 2) +
  scale_color_brewer(palette = "Set2") +
  xlab("Year") +
  ylab("CPI") +
  ggtitle("Graph 3 : Change in Highly Nutritious Food CPI Over the Years")

3.2 Multiple Linear Regression

To analyze the impact of a change in the food CPI, the multiple linear regression (MLR) technique is used. MLR is preferred over simple linear regression because the study covers several expenditure groups, and the quintile has to enter the model alongside the CPI.

The two processed datasets are merged before starting the analysis. In the output below, the coefficient of CPI_of_food is 0.000944: with the quintile held fixed, a 100-point rise in food CPI is associated with a rise of about 0.09 percentage points in the share of spending that goes to food. Because the model has no interaction term, this slope is shared by all quintiles. The Quintile coefficients, which are all negative, measure how much lower the average food share of each quintile is compared with the first quintile, which is the reference level; for example, the last quintile spends about 19.4 percentage points less of its budget on food. These coefficients describe differences in level between the groups rather than different reactions to CPI, so estimating a separate reaction for each quintile would require an interaction between CPI_of_food and Quintile.
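The difference between a shared slope and group-specific slopes can be seen on a small example. This is a hedged sketch on invented data, not a re-analysis of the project data: `toy`, `additive`, and `with_interaction` are hypothetical names, and the two groups are constructed to have opposite slopes on purpose.

```r
# Hypothetical toy data: two groups with deliberately different slopes
toy <- data.frame(
  CPI_of_food   = rep(c(100, 200, 300, 400), times = 2),
  Quintile      = rep(c("First_quintile", "Last_quintile"), each = 4),
  Food_Spending = c(30, 32, 34, 36,   # first group: +2 points per 100 CPI points
                    17, 16, 15, 14)   # last group:  -1 point  per 100 CPI points
)

# Additive model: one slope for everyone, plus a level shift per group
additive <- lm(Food_Spending ~ CPI_of_food + Quintile, data = toy)

# Interaction model: each group gets its own slope
with_interaction <- lm(Food_Spending ~ CPI_of_food * Quintile, data = toy)

shared_slope <- unname(coef(additive)["CPI_of_food"])
slope_first  <- unname(coef(with_interaction)["CPI_of_food"])
slope_last   <- slope_first +
  unname(coef(with_interaction)["CPI_of_food:QuintileLast_quintile"])

print(c(shared = shared_slope, first = slope_first, last = slope_last))
```

In the additive fit the single slope is a compromise between the two groups, while the interaction fit recovers one slope per group: the reference group's slope is the `CPI_of_food` coefficient, and every other group's slope is that coefficient plus its interaction term.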

merged_data <- left_join(COCT_final, CPI_food_final, by = "Year") #merging two processed data

#Multiple Linear Regression
df_long <- pivot_longer(merged_data, cols = starts_with("First"):starts_with("Last"),
                        names_to = "Quintile", values_to = "Food_Spending")

model_multi <- lm(Food_Spending ~ CPI_of_food + Quintile, data = df_long)
#summary(model_multi)
model_multi

Call:
lm(formula = Food_Spending ~ CPI_of_food + Quintile, data = df_long)

Coefficients:
            (Intercept)              CPI_of_food  QuintileFourth_quintile  
              33.737206                 0.000944               -10.882537  
  QuintileLast_quintile  QuintileSecond_quintile   QuintileThird_quintile  
             -19.392117                -4.695462                -7.836493  

3.3 Trend and Elasticity Analysis

In Graph 4, the slope of the red lines represents the reaction of each quintile to the change in CPI. There are big gaps between the points on the right side of the graphs. One reason is that the expenditure data has no values for 2020 and 2021. The other is that after 2020, when the food CPI is around 560, the values increase dramatically due to the country’s political and economic status: from 2021 to 2023 the food CPI rises by roughly 66% to 86% a year.

ggplot(df_long, aes(x = CPI_of_food, y = Food_Spending)) +
  geom_point(color = "blue", size = 2) +
  geom_smooth(method = "lm", color = "red", se = FALSE, linewidth = 1) +
  facet_wrap(~ Quintile, ncol = 2) +
  labs(title = "Graph 4: The Correlation Between Food Expenditure Percentage and Food CPI",
       x = "Food CPI", y = "Food Expenditure Percentage") +
  theme_bw()

df_long_2 <- df_long %>%
  mutate(Spending_per_CPI = Food_Spending / CPI_of_food)

ggplot(df_long_2, aes(x = Year, y = Spending_per_CPI, color = Quintile)) +
  geom_line(linewidth = 1) + #linewidth replaces size for lines in ggplot2 >= 3.4.0
  labs(title = "Graph 5: Normalized Expenditure Ratios by Food CPI",
       x = "Year", y = "Expenditure Ratio / Food CPI") +
  theme_classic()

In addition, Graph 5 shows that the ratio of food expenditure percentage to food CPI converges across the quintiles over the years. The ratio falls for every quintile, which points to a lower ability to purchase. Moreover, the lowest quintiles show a much steeper decrease in the ratio. This is a sign of their greater vulnerability to the rise of the food CPI.

The elasticity of the food expenditure percentage measures the percentage change in that share for a 1% increase in the food CPI. It is estimated for each quintile as the slope of a regression of the logarithm of the share on the logarithm of the food CPI. The estimated elasticities are all close to 0: slightly positive for the first three quintiles, slightly negative for the fourth, and about -0.1 for the last. This suggests that no expenditure group can respond strongly to a change in CPI, since there is no substitute for food. The first quintile has the largest positive elasticity, so it seems to be the one that suffers the most from the increase in the food CPI. Nevertheless, the response is similar across the expenditure groups in Turkey.
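Why the slope of a log-log regression can be read as an elasticity is easiest to see on data with a known answer. The sketch below builds a hypothetical series that follows share = 30 * CPI^(-0.1) exactly and checks that the regression recovers the exponent; `toy_cpi`, `toy_share`, and `true_elasticity` are assumptions made up for the illustration.

```r
# Hypothetical series with a known elasticity of -0.1:
# share = 30 * CPI^(-0.1), so log(share) = log(30) - 0.1 * log(CPI)
toy_cpi         <- c(100, 200, 400, 800)
true_elasticity <- -0.1
toy_share       <- 30 * toy_cpi^true_elasticity

# The slope of log(share) on log(CPI) is the elasticity
fit <- lm(log(toy_share) ~ log(toy_cpi))
estimated_elasticity <- unname(coef(fit)["log(toy_cpi)"])

# Effect of doubling the CPI on the share under this elasticity
share_factor_when_cpi_doubles <- 2^estimated_elasticity

print(estimated_elasticity)
print(share_factor_when_cpi_doubles)
```

With real data the points do not lie exactly on a power curve, so the estimated slope is an average elasticity over the sample period rather than an exact constant.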

df_merged <- merged_data %>%
  mutate(log_CPI = log(CPI_of_food),
         log_First = log(First_quintile),
         log_Second = log(Second_quintile),
         log_Third = log(Third_quintile),
         log_Fourth = log(Fourth_quintile),
         log_Last = log(Last_quintile))

elasticity_model_first_quintile <- lm(log_First ~ log_CPI, data = df_merged)
elasticity_model_second_quintile <- lm(log_Second ~ log_CPI, data = df_merged)
elasticity_model_third_quintile <- lm(log_Third ~ log_CPI, data = df_merged)
elasticity_model_fourth_quintile <- lm(log_Fourth ~ log_CPI, data = df_merged)
elasticity_model_fifth_quintile <- lm(log_Last ~ log_CPI, data = df_merged)

elasticity_value_first <- coef(elasticity_model_first_quintile)["log_CPI"]
elasticity_value_second <- coef(elasticity_model_second_quintile)["log_CPI"]
elasticity_value_third <- coef(elasticity_model_third_quintile)["log_CPI"]
elasticity_value_fourth <- coef(elasticity_model_fourth_quintile)["log_CPI"]
elasticity_value_fifth <- coef(elasticity_model_fifth_quintile)["log_CPI"]
cat("Food CPI Elasticity of Food Expenditure (First_quintile):", round(elasticity_value_first, 3), "\n")
cat("Food CPI Elasticity of Food Expenditure (Second_quintile):", round(elasticity_value_second, 3), "\n")
cat("Food CPI Elasticity of Food Expenditure (Third_quintile):", round(elasticity_value_third, 3), "\n")
cat("Food CPI Elasticity of Food Expenditure (Fourth_quintile):", round(elasticity_value_fourth, 3), "\n")
cat("Food CPI Elasticity of Food Expenditure (Last_quintile):", round(elasticity_value_fifth, 3), "\n")
Food CPI Elasticity of Food Expenditure (First_quintile): 0.012
Food CPI Elasticity of Food Expenditure (Second_quintile): 0.01
Food CPI Elasticity of Food Expenditure (Third_quintile): 0.007
Food CPI Elasticity of Food Expenditure (Fourth_quintile): -0.004
Food CPI Elasticity of Food Expenditure (Last_quintile): -0.095

3.4 Expenditure Percentage Prediction for 2024-2025

To predict future food expenditure percentages, a linear regression is fitted for each quintile and R’s predict function is applied. The food CPI values for 2024 and 2025 are taken from the raw data, from which the rows for these two years were removed at the beginning of the project. Using these CPI values, the food expenditure percentage is predicted for each quintile. The results are shown in Table 4.

#prediction models for each quintile according values of food CPI of 2024 and 2025 obtained from raw data
model_for_first <- lm(First_quintile ~ CPI_of_food, data = merged_data) 
prediction_for_first_2024 <- predict(model_for_first, newdata = data.frame(CPI_of_food = 3364.88))
prediction_for_first_2025 <- predict(model_for_first, newdata = data.frame(CPI_of_food = 4141.51))

model_for_second <- lm(Second_quintile ~ CPI_of_food, data = merged_data) 
prediction_for_second_2024 <- predict(model_for_second, newdata = data.frame(CPI_of_food = 3364.88))
prediction_for_second_2025 <- predict(model_for_second, newdata = data.frame(CPI_of_food = 4141.51))

model_for_third <- lm(Third_quintile ~ CPI_of_food, data = merged_data) 
prediction_for_third_2024 <- predict(model_for_third, newdata = data.frame(CPI_of_food = 3364.88))
prediction_for_third_2025 <- predict(model_for_third, newdata = data.frame(CPI_of_food = 4141.51))

model_for_fourth <- lm(Fourth_quintile ~ CPI_of_food, data = merged_data) 
prediction_for_fourth_2024 <- predict(model_for_fourth, newdata = data.frame(CPI_of_food = 3364.88))
prediction_for_fourth_2025 <- predict(model_for_fourth, newdata = data.frame(CPI_of_food = 4141.51))

model_for_last <- lm(Last_quintile ~ CPI_of_food, data = merged_data) 
prediction_for_fifth_2024 <- predict(model_for_last, newdata = data.frame(CPI_of_food = 3364.88))
prediction_for_fifth_2025 <- predict(model_for_last, newdata = data.frame(CPI_of_food = 4141.51))

prediction_of_COCT <- tibble(Year=c(2024, 2025), #tibble() replaces the deprecated data_frame()
                             First_quintile=c(prediction_for_first_2024, prediction_for_first_2025),
                             Second_quintile=c(prediction_for_second_2024, prediction_for_second_2025),
                             Third_quintile=c(prediction_for_third_2024, prediction_for_third_2025),
                             Fourth_quintile=c(prediction_for_fourth_2024, prediction_for_fourth_2025),
                             Last_quintile=c(prediction_for_fifth_2024, prediction_for_fifth_2025))

knitr::kable(prediction_of_COCT, caption = "Table 4: Expenditure Percentage Prediction of Quintiles (2024-2025)")
Table 4: Expenditure Percentage Prediction of Quintiles (2024-2025)
Year First_quintile Second_quintile Third_quintile Fourth_quintile Last_quintile
2024 42.03211 34.52911 29.76349 25.12559 10.31143
2025 44.11366 35.87105 30.67744 25.62017 9.14512

After looking at Table 4 and Graph 6, which plots the COCT data extended with the predictions, we can see that the predictions for 2024 and 2025 are simple linear extrapolations. Still, they give an idea of the direction of the change. In this case, the food expenditure percentage increases for all the quintiles except the last (fifth) quintile. The prediction may be correct at least in terms of direction. Neither dataset contains expenditure percentages for 2024 and 2025, so it cannot be validated yet.
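How uncertain such an extrapolation is can be gauged with a prediction interval. The sketch below refits the first-quintile model on the values printed in Table 2 and Table 3 and asks predict for a 95% interval at the 2024 CPI value used above; `first_q`, `fit_first`, and the other names are introduced only for this illustration, and the table values are rounded, so the point prediction may differ slightly from Table 4.

```r
# First-quintile food share (Table 2) paired with yearly food CPI (Table 3);
# 2020 and 2021 are absent because the COCT data has no rows for them
first_q <- data.frame(
  CPI_of_food = c(112.0787, 122.9450, 138.2108, 155.8842, 168.3875, 186.2000,
                  197.8150, 214.4567, 233.9725, 263.4925, 292.8617, 309.8108,
                  349.1550, 411.8717, 492.3342, 1293.1833, 2144.2950),
  First_quintile = c(40.63172, 39.47612, 37.30757, 36.44696, 34.25025, 32.52,
                     31.69, 31.34, 30.42, 30.12, 31.69, 30.93,
                     30.73, 30.89, 33.36, 39.24, 39.18)
)

fit_first <- lm(First_quintile ~ CPI_of_food, data = first_q)

# 2024 food CPI value used in the project
new_cpi <- data.frame(CPI_of_food = 3364.88)

# Point prediction plus a 95% prediction interval (columns fit, lwr, upr)
pred_2024 <- predict(fit_first, newdata = new_cpi,
                     interval = "prediction", level = 0.95)

# TRUE when the new CPI lies outside the range the model was fitted on
is_extrapolation <- new_cpi$CPI_of_food > max(first_q$CPI_of_food)
interval_width   <- unname(pred_2024[1, "upr"] - pred_2024[1, "lwr"])

print(pred_2024)
print(is_extrapolation)
```

Because the 2024 CPI is well above every value in the fitted range, the interval should be read as a lower bound on the real uncertainty: it assumes the linear relationship keeps holding outside the observed data.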

COCT_final_new_years <- rbind(COCT_final, prediction_of_COCT)

COCT_long_new_years <- COCT_final_new_years %>%
  pivot_longer(cols = -Year,
               names_to = "category",
               values_to = "value")

ggplot(COCT_long_new_years, aes(x = Year, y = value, color = category, group = category)) +
  geom_line(linewidth = 1) + #linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 2) +
  xlab("Year") +
  ylab("Food Expenditure Percentage of Quintiles") +
  ggtitle("Graph 6: Food Expenditure Percentage by Expenditure Groups Over the Years with Prediction for 2024-2025") +
  theme_minimal()

4. Results and Key Takeaways

This study tried to shed light on how food inflation affects the spending behavior of people with different expenditure levels. Two datasets are used for analysis, visualization, and inference. Beyond the high level of inflation itself, the effort needed to access food has recently become one of the major items on the agenda in Turkey. It is not all about the numbers, since inflation does not affect every person in the same way. However, it has a crucial impact on everyone.

The key takeaways of this study are as follows:

  • People spending the least have to spend a larger percentage of their money on food than others. Between 2005 and 2023, 30% to 41% of their spending went to food.
  • An increase in the food CPI is associated with the largest increase in the food expenditure percentage for the first quintile. Namely, it affects the lowest expenditure group the most.
  • No expenditure group can respond strongly to the change in CPI, since there is no substitute for food. Therefore, the elasticity of their food expenditure percentage is very low.

Even though it is known that Turkey suffers from high consumer price index values and relatively high inflation rates, the impact should be investigated at a finer resolution. The study aims to encourage the authorities to take a close look at the impact on people’s lives and take action accordingly. Isn’t this what the government must do for the citizens after all?
